feat(qwp): connect timeout, ingest callbacks, and lazy_connect (tolerant startup) on the QuestDB facade#60
feat(qwp): connect timeout, ingest callbacks, and lazy_connect (tolerant startup) on the QuestDB facade#60bluestreak01 wants to merge 34 commits into
Conversation
Establish a real, cross-platform connect timeout for the HTTP and WebSocket (QWP) transports. Previously a connect to a black-holed or firewalled host blocked on the OS-level TCP connect timeout (often 60-120s) because the socket was created blocking and only switched to non-blocking *after* connect; the transport exposed no knob to clamp it. Approach: a new native primitive switches the socket to non-blocking *before* connect, so connect() returns EINPROGRESS immediately, then polls for writability bounded by the caller's budget and confirms the outcome via SO_ERROR. A distinct return code (CONNECT_TIMEOUT, -3) lets the Java layer raise a timeout-flagged exception rather than decode errno. Native: - share/net.c: connectAddrInfoTimeout + awaitConnectComplete (poll + getsockopt(SO_ERROR), monotonic-clock EINTR handling) - windows/net.c: Winsock equivalent (select write/except sets) - share/net.h: ECONNTIMEOUT (-3) sentinel Java: - Net / NetworkFacade(Impl): connectAddrInfoTimeout + CONNECT_TIMEOUT - HttpClientConfiguration.getConnectTimeout() (default 0 = OS fallback) - HttpClient.connect() / WebSocketClient.doConnect() honor it and throw a timeout-flagged HttpClientException on CONNECT_TIMEOUT - Sender builder: connectTimeoutMillis() + connect_timeout connect-string key (legacy http and ws/wss parsers) + ConfigSchema COMMON key - QwpWebSocketSender / QwpQueryClient: thread the value through to their WebSocketClient (adds QwpQueryClient.withConnectTimeout) Default is unset (0): behaviour is unchanged unless connect_timeout is configured. Tests: NetConnectTimeoutTest covers loopback success, refused-vs-timeout disambiguation, and a black-hole timeout that fires within budget; config-honored drift guards updated for the new COMMON key.
On a runner with no route to TEST-NET-1 (192.0.2.0/24) connect() fails fast with ENETUNREACH instead of dropping the SYN, so the timeout path can't be exercised. Skip (Assume) in that case rather than asserting a timeout, while still proving the call never blocked on the OS connect timeout.
GitHub now forces actions onto Node 24 (glibc >= 2.27), which cannot run inside the manylinux2014 (glibc 2.17) container the linux-x86-64 native build used; actions/checkout failed before compilation. The old Node-20-glibc-217 override only patched /__e/node20, not /__e/node24. Switch the job to quay.io/pypa/manylinux_2_28_x86_64 (glibc 2.28, runs stock Node 24) and drop the Node hack, nasm src.rpm rebuild, and manual CMake download, mirroring the linux-aarch64 job that already builds on manylinux_2_28.
The pooled QuestDB facade built its ingest Senders from config strings only (SenderPool -> Sender.fromConfig), so the programmatic ingest callbacks -- SenderErrorHandler and SenderConnectionListener -- were unreachable: a facade user got the default loud-not-silent handlers with no way to observe async ingest errors or connection transitions. Expose both as QuestDBBuilder setters and thread them to every pooled Sender: - QuestDBBuilder.errorHandler(...) / .connectionListener(...) - QuestDBImpl gains a full constructor carrying the callbacks; the public constructor forwards them and the 12-arg white-box test-seam constructor is preserved as a delegating shim (null callbacks). - SenderPool gains a full constructor + applyUserCallbacks() that applies the callbacks to every sender it builds (both the non-SF and SF paths); the 8-arg test-seam constructor is preserved as a shim. Recovery delegates (internal, short-lived, OFF-mode drain senders) are deliberately excluded so the user's callbacks never see events from internal machinery. Defaults are null -> behaviour is unchanged unless a callback is set. Tests: QuestDBFacadeCallbacksTest prewarms one ingest sender at a dead port in async mode with a tight reconnect budget and asserts the facade-wired errorHandler receives the budget-exhaustion SenderError and the facade-wired connectionListener observes connection events -- no server required.
The QuestDB facade always built a reader (QueryClientPool), which prewarms synchronously and fail-fast (default query_pool_min=1, QwpQueryClient has no async connect). So a down server / read primary sank the whole facade build, taking the write side with it. Add QuestDBBuilder.writeOnly(): build an ingest-only handle that never constructs the query pool, so the read side cannot fail startup. A query config is no longer required in this mode (any query config set is ignored), and query()/newQuery() throw a clear "write-only" IllegalStateException. - QuestDBImpl gains a write-only public constructor + a writeOnly flag on the full constructor; the 12-arg white-box test-seam constructor stays unchanged (delegates with writeOnly=false). queryPool/queryThreadLocal are null in write-only mode. - PoolHousekeeper tolerates a null query pool. - QuestDBBuilder.buildWriteOnly() validates + resolves only the sender/shared pool knobs from the ingest config. Pair with initial_connect_retry=async (or sender_pool_min=0) on the ingest config so the write side does not fail-fast either -> the facade starts with no server present. Tests: QuestDBWriteOnlyTest proves the facade builds with no server, that query()/newQuery() are disabled, that no query config is required, and that an async warm sender can buffer a write while serverless.
…nnects End-to-end resilience test for the QuestDB facade: build with the server down (ingest initial_connect_retry=async + query_pool_min=0), buffer a write, then bring the server up and assert the write side reconnects and the previously-deferred reader connects on the first query. Uses two TestWebSocketServers bound-but-not-accepting to model a reachable -but-down server (handshakeCount stays 0 until start()). The mock cannot serve real SELECT rows, so the read step asserts the query client connects once the server is up, not the row contents. Stable across repeated runs.
Remove the committed Linux/Windows native binaries (libquestdb.so, libquestdb.dll) and compile them locally during the Azure test CI. - New ci/build_native.yaml template compiles libquestdb on the runner: Linux (cmake+nasm+build-essential) and Windows (MinGW-w64+NASM via choco). macOS keeps using the committed .dylib. Inits the zstd submodule first. - Output is copied into src/main/resources/.../bin/<platform>/ so mvn install packages it into the client jar for both client and OSS server tests; the loader also picks up the CMake bin-local output directly. - Wired the template into run_tests_pipeline.yaml before client install. Committed binaries are still produced by the release GitHub Action.
Remove the committed darwin-aarch64/darwin-x86-64 libquestdb.dylib and build them on the macOS runners, matching the Linux/Windows approach. No native binaries remain committed; all are compiled during the test CI. - build_native.yaml: add a macOS build step (brew cmake/nasm, MACOSX_DEPLOYMENT_TARGET=13.0), detect darwin-aarch64 vs darwin-x86-64 via uname -m, and copy the dylib into src/main/resources/.../bin/<platform>/. - Init the zstd submodule on all platforms (it was skipped on Darwin). Release artifacts are still produced by the release GitHub Action.
The macos-15 (x64) agent hardware no longer exists, so remove the mac-x64 matrix entry. macOS is now tested on mac-aarch64 only. The darwin-x86-64 .dylib is still produced by the release GitHub Action, and build_native.yaml keeps its uname-based arch detection so an x64 macOS runner would still build correctly if ever reintroduced.
The GitHub Actions build-jdk8 job ran the full test suite against the
committed native libraries, which are now removed. Without the .so the
io.questdb.client.std.{Os,Files,Unsafe,...} static initializers fail with
NoClassDefFound (1289 errors).
Compile the native .so from source first (zstd submodule + cmake/nasm/
build-essential), against the JDK 8 JNI headers, and copy it into
src/main/resources/.../bin/linux-x86-64 so it survives 'mvn clean' and loads
via the production bin/<platform> path. Update the now-stale comment.
glibc 2.17 moved clock_gettime() into libc under a new GLIBC_2.17 version node. Building the release .so in a modern container (manylinux_2_28) binds clock_gettime@GLIBC_2.17, which raises the whole library's glibc floor to 2.17 and breaks loading on glibc 2.14-2.16 hosts. Add src/main/c/share/glibc_compat.h with a .symver directive forcing the reference back to clock_gettime@GLIBC_2.2.5 (x86-64 glibc only; no-op on aarch64/macOS/Windows), include it from net.c and os.c, list it in the CMake sources, and document the glibc floor in rebuild_native_libs.yml.
The Coverage Report job runs 'mvn -P jacoco test' on core but had no native build step, so after dropping the committed binaries it failed to load libquestdb.so (NoClassDefFound in io.questdb.client.std.*). Add the build_native.yaml template before the coverage test run, matching the BuildAndTest job. The job runs on Linux, so it compiles libquestdb.so.
Collapse the dual ingest/query config surface on the QuestDB facade into a single configuration string for the whole cluster. A QuestDB cluster is one logical target reached over QWP for both ingest and query, so one ws/wss string -- listing every node in a single `addr` server list -- now drives both the sender and query pools. - QuestDBBuilder: drop ingestConfig()/queryConfig(); fromConfig() sets the one cluster config. Remove the cross-side pool-key conflict resolution (no two strings to reconcile) -- resolvePoolInt/Long read one ConfigView. build() validates the single string with both the ingest and egress validators; each side applies the keys it owns and ignores the rest. - QuestDB: remove the connect(ingest, query) overload; connect(config) and builder() now document the one-config/server-list model. - QuestDBImpl is unchanged: the builder passes the one config to both pool slots, preserving the white-box reflection seam. Tests: TestWebSocketServer now serves both pools from one config like a real node -- SERVER_INFO is emitted only on the egress /read path (the ingest /write ACK stream would choke on it), plus a setRejectReadUpgrade() toggle to fail just the query upgrade. Rewrote QuestDBBuilderTest and updated the facade callback/recovery/write-only tests and the examples accordingly.
…n-string key Make writeOnly() deliver its own promise and reach it from the connect string. Previously "start even when the server is down" needed two knobs that look unrelated: writeOnly() (skip the fail-fast read pool) plus initial_connect_retry=async (keep the write prewarm from fail-fast-ing). The former governs the read side, the latter the write side, so writeOnly() alone still hard-failed build() when sender_pool_min >= 1 and the server was down. - writeOnly() now defaults the ingest side to a non-blocking async initial connect (injected right after the schema so an explicit initial_connect_retry in the user's string still wins, last-write-wins). build() returns promptly with the server down and the sender pool warm; writes buffer until the wire comes up. - New POOL-side connect-string key write_only=on, equivalent to .writeOnly(), so the mode is reachable from any config string (and QuestDB.connect). The two ws clients ignore it; the facade routes on it. Tests: writeOnly() with sender_pool_min defaulting to 1 and no initial_connect_retry now builds without fail-fast; write_only=on routes to the ingest-only path via builder and via connect(). PoolConfigHonoredTest's drift guard skips the routing flag (not a numeric sizing knob).
…nabled Strengthen the server-recovery test to assert what the write-only mode is NOT: on a normal facade built while the server is down (lazy read pool via query_pool_min=0, async ingest), query() must still hand back a usable builder *before* the server is up -- reads are enabled, just deferred -- and the deferred reader connects on the first submit once the server comes up. This is the read-capable counterpart to write-only, where query() throws for the life of the handle.
…ant startup)
Drop write-only mode (it permanently disabled reads -- query()/newQuery()
threw for the life of the handle) in favour of a read-capable tolerant-startup
flag, lazy_connect, reachable from the connect string.
lazy_connect=true:
- a) starts even when the server is down -- the ingest side connects async and
the read pool defaults to query_pool_min=0, so neither side fail-fasts;
- b) buffers writes while the server is down (async sender);
- c) reads once the server is up -- the read pool stays ENABLED and connects
lazily on the first query.
Because both sides must start non-blocking, a knob that forces a blocking /
fail-fast startup is a configuration conflict, rejected up front with a clear
remedy:
- initial_connect_retry other than async (off/false/on/true/sync), and
- an explicit query_pool_min > 0 (connect string or builder call).
Changes:
- ConfigSchema: write_only -> lazy_connect (Side.POOL; both clients ignore it).
- ConfigView.getBool: accept true/false (and on/off).
- QuestDBBuilder: remove writeOnly()/buildWriteOnly(); build() resolves
lazy_connect, validates the two conflicts, defaults query_pool_min to 0 and
injects initial_connect_retry=async when unset.
- QuestDBImpl: remove the write-only constructor/flag and requireQueryEnabled;
the query pool is always built (the white-box reflection seam is unchanged).
- Tests: QuestDBWriteOnlyTest -> QuestDBLazyConnectTest (start+write while down,
reads stay enabled, both conflicts via string and builder, true/on parsing);
QuestDBServerRecoveryTest now dogfoods lazy_connect=true for the full
down->write->up->read lifecycle; PoolConfigHonoredTest drift guard skips the
flag.
…e timeout QwpQueryClient.runUpgradeWithTimeout wrapped connect() and upgrade() in one try block, so a connect_timeout overage -- the timeout-flagged HttpClientException from doConnect()'s CONNECT_TIMEOUT path -- was caught by the isTimeout() branch meant for upgrade() and rewritten as "WebSocket upgrade to host:port exceeded auth_timeout=<authTimeoutMs>ms". A user with connect_timeout=500 and auth_timeout_ms=15000 saw, after ~500ms, an error blaming a 15000ms auth timeout (wrong phase and wrong value). Move connect() outside the upgrade try so the auth_timeout rewrite only applies to genuine upgrade-phase timeouts; connect-phase failures propagate with their own "connect timed out ..." message. The failover walk is unchanged (the exception is still a transport error and the next endpoint is tried). The ingest side (QwpWebSocketSender) was already correct -- it routes through QwpUpgradeFailures.classify, which leaves the connect-timeout exception unmodified. Add QwpQueryClientConnectTimeoutTest: a TEST-NET-1 blackhole connect with connect_timeout < auth_timeout must report connect_timeout, not auth_timeout. It skips gracefully when the runner has no route to the blackhole, mirroring NetConnectTimeoutTest. Verified it fails on the pre-fix code with the exact misreported message. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nly removal
The PoolHousekeeper reap loop wrapped queryPool.reapIdle() in an
`if (queryPool != null)` guard whose comment ("null for a write-only
handle") described write-only mode. That mode was removed in the
lazy_connect change (7491d95): QuestDBImpl now builds the query pool
unconditionally and is the sole PoolHousekeeper caller, so the field is
never null in a live handle. The null branch is unreachable and the
comment is stale -- drop both. The outer best-effort Throwable catch
stays; it has nothing to do with write-only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ConfigView.getBool accepts true/false and on/off, but its invalid-value error read "(expected true, false)", under-reporting the accepted forms (e.g. lazy_connect=on is valid yet the message implies otherwise). List all four, matching getBoolOnOff's convention of naming exactly what it accepts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reshape the QuestDB facade so reads and writes share one pooled-lease model, and remove the thread-affine footguns on both sides. Ingest: - Remove QuestDB.sender() and releaseSender(), along with the entire thread-pin subsystem behind them (SenderPool.pinToCurrentThread, releaseCurrentThread, clearPinIfCurrent, the threadAffine ThreadLocal, and the PooledSender invalidated flag that existed only to make pinning safe). borrowSender() is now the only way to lease a Sender. Egress: - Add QuestDB.borrowQuery(), a closeable, non-allocating Query lease that mirrors borrowSender(). Each pooled QueryWorker owns one pre-allocated QueryImpl, handed out reset on borrow; submit() dispatches on the held worker (single-flight) and close() returns it to the pool. The worker no longer auto-releases per query. - Remove query(), newQuery(), and executeSql(). Reads now connect at borrow time rather than submit time; under lazy_connect the read pool still defaults to min=0, so build() does not fail-fast while the server is down. Test seams: - Make the white-box seam constructors public and annotate @testonly where production never calls them (QuestDBImpl, SenderPool). The QueryClientPool connectHook ctor stays public without @testonly because QuestDBImpl constructs the query pool through it. Tests now call these constructors directly instead of via reflection. Update the client tests, the usage example, and the startup/failover design doc to the new API.
try (server) on an effectively-final existing variable is Java 9+ syntax (JEP 213) and fails the JDK 8 test-compile (the source-of-truth target) with -source 1.8, breaking the build-jdk8 CI job and the release build before any test runs. Inline the resource construction into the try-with-resources declaration, which is valid on Java 8 and keeps the server variable name used throughout the body.
…se-after-close
A pooled Query/Sender handle was the reused per-slot object itself, guarded
only by a non-volatile in-use/borrowed flag. Once a worker/slot was released
and re-borrowed, that flag flips back to "live", so a stale handle's
close()/cancel()/write would leak into a *different* borrow: a duplicate close
double-released the worker/slot (enqueued twice -> two concurrent borrowers on
one non-thread-safe client/delegate), and a cached Completion.cancel() or stale
write hit whatever borrow now owns it. Idempotent close() and no-op cancel()
are documented contracts, so this was reachable from contract-legal code, not
just misuse, with pool-wide blast radius and no -ea guard.
Fix: give every borrow its own immutable generation, stamped under the pool
lock when the worker/slot is handed out and bumped again when it is returned.
The reused state stays on the slot; callers get a thin per-borrow handle that
carries the generation and validates it on every operation:
- close()/cancel() are no-ops on a stale generation (idempotency preserved),
- submit()/data writes throw,
- release/giveBack/discardBroken re-check the generation under the pool lock
so a worker/slot can never be enqueued twice, plus an -ea assert that it is
not already in the available deque.
Egress: QueryImpl stops being the user-facing Query; new QueryLease wraps it.
Ingest: new SenderSlot is the reused slot; PooledSender becomes the per-borrow
wrapper (keeps the public name, so borrow() still returns it). The per-submit
path stays allocation-free; only the small lease handle is created per borrow
(routinely scalar-replaced under try-with-resources).
Adds QueryLeaseGenerationTest and SenderLeaseGenerationTest covering the
double-release and cross-borrow cancel/write paths; updates the white-box
tests to the new shapes. Full core suite green under -ea (the lone failure is
the unrelated pre-existing FilesTest M2, which fails identically on master).
QueryWorker.runLoop() consumed the dispatch hand-off (q = current) under signalLock but cleared the slot (current = null) only after runOn() returned, outside the lock. A Query lease is single-flight but reused: the user thread loops submit() -> await() on the same handle. The terminal callback inside runOn() wakes the user thread, which can call submit() -> dispatch() -- setting current = q and signalling -- before the worker thread reaches its post-run finally block. That stale current = null then clobbered the freshly dispatched job and discarded its already-consumed signal, so the worker parked forever on the condition while the user thread blocked on a Completion that never fired. The borrowed worker never returned to the pool and the caller hung indefinitely. Clear current under signalLock at the moment of consumption and drop the post-run finally clear. dispatch() now cannot be clobbered: by the time the next dispatch runs, the worker is either already awaiting (so the signal wakes it) or will observe current != null on the while check and skip awaiting. The exception path leaves current already null, and the shutdown branch still clears under the lock. Surfaced as a 60s hang in QuestDBFacadeE2ETest.testSustainedMixedConcurrency (more threads than pool slots, repeated submit/await per lease). Was intermittent and timing-sensitive, so it showed up mainly on aarch64 CI; reproduced locally on x86 about one run in four, and 15/15 clean with this fix.
borrowQuery() returns a thin Query lease that is freshly allocated on every borrow, while the heavy state it delegates to -- the per-worker QueryImpl -- is pre-allocated once and reused across borrows. Nothing pinned that the fresh wrapper actually points back at the one pooled QueryImpl, so a regression that allocated a new QueryImpl per borrow (or dropped the worker's reuse) would have gone unnoticed here. Add testLeaseWrapsSamePooledQueryImpl: two lease() calls on the same worker must return distinct wrappers (assertNotSame) that delegate to the same pooled QueryImpl (assertSame on the reflected impl field). lease() never dereferences the client or pool, so the worker is built with nulls, mirroring the null-worker shortcut the reset test already uses.
QueryImpl.cancel(gen) validated the lease generation with an unlocked volatile read and then issued the wire cancel with no lock, so the two steps were not atomic. cancel() is a cross-thread watchdog/timeout API, so this is a TOCTOU: a watchdog can pass the generation check, get preempted while the lease is released (close -> release bumps the generation) and the worker is re-borrowed and re-submitted by another caller, then resume and call cancelInFlight() -- aborting that caller's in-flight query with a spurious STATUS_CANCELLED. The client never resets its cancel state on release/re-borrow, so the stale cancel lands either via the wire requestCancel(currentRequestId) or via the surviving pendingCancel latch the next executeOnce consumes. Route the cancel through QueryClientPool.cancelIfCurrent(worker, gen), which re-checks worker.generation() == gen and issues the wire cancel together under the pool lock -- the same lock acquire()/release() bump the generation under. Once cancelIfCurrent holds the lock the generation cannot change, so a cancel whose lease has gone stale is dropped instead of hitting the new borrower. The cancel itself is non-blocking (a volatile flag plus an AtomicLong set), so the lock is held only briefly. cancel() keeps a cheap unlocked fast-path that drops an obviously-stale or already-done cancel without taking the lock; the authoritative check stays under the lock. close() routes its abort-on-unawaited-close through the same path, closing the identical window on its cancelInFlight() call. QueryLeaseGenerationTest.testStaleCancelDoesNotReachClient only covered the already-stale case (it pre-advanced the generation, leaving no check->act window), so it could not catch this. Add testConcurrentCancelDoesNotReachClientAfterReborrow: it holds the pool lock so a concurrent cancel parks inside the re-check, advances the generation (release + re-borrow) under the lock, then releases it and asserts the parked cancel observed the new generation and never reached the client. The new test fails against the unlocked check-then-cancel and passes with the locked path.
JavaTlsClientSocket.startTlsSession ran the TLS handshake with raw delegate.recv/delegate.send on a non-blocking socket and never waited on socket readiness. A peer that completed the TCP connect but stalled before its half of the handshake left the engine in NEED_UNWRAP with recv returning 0 (would-block), so the loop re-read in a tight cycle: a 100% CPU busy-spin with no deadline. connect_timeout bounded only the TCP connect, and the WebSocket upgrade's auth/request timeout never covered the handshake, so a stalled wss:// (e.g. QuestDB Cloud) handshake could pin a core indefinitely and defeat a bounded connect. Drive the handshake through the client's existing deadline-aware ioWait, the same primitive recvOrDie/doSend already use. When the socket would block, the handshake hands control to a SocketReadinessWaiter that parks on epoll/kqueue/select for the remaining connect budget and throws a timeout-flagged exception once it is spent. This removes the spin and bounds the handshake in one change: both NEED_UNWRAP (recv == 0) and the NEED_WRAP send loop (send == 0) now wait instead of spinning. doConnect now calls setupIoWait() before the handshake (so the fd is registered when the waiter parks) and bounds the handshake by connect_timeout, falling back to the request timeout when connect_timeout is unset, so the handshake can no longer hang or spin even with the default config. The connect TLS block disconnects on any handshake error (including the waiter's timeout) so the fd and native buffers do not leak. Both HttpClient (https) and WebSocketClient (wss) connect paths share the fix. Extracted the handshake loop into runHandshake(waiter) so a stub SSLEngine can exercise the wait paths. Added two tests: testHandshakeWaitsForReadabilityInsteadOfBusySpinning drives a stalled peer and asserts the handshake yields to the waiter exactly once (a method-level timeout fails the test if the spin returns), and testHandshakeCompletesWithoutWaitingWhenEngineMakesProgress guards the happy path.
awaitConnectComplete reset its time baseline (start = now) on every EINTR and subtracted only whole milliseconds of elapsed time. Under a high-frequency signal storm on the connecting thread -- e.g. a wall-clock profiler or interval timer interrupting the blocked poll() more than once per millisecond -- each interval truncated to 0 ms, so the budget never decremented and poll() was re-armed with the full timeout every iteration. The connect timeout could then extend well past its bound, or never fire at all, contradicting the comment that EINTR storms cannot extend it. Even at lower rates each interrupt discarded up to ~1 ms of accounting, drifting the timeout one-directionally. Compute one absolute monotonic deadline up front and derive the remaining poll() budget from it each iteration. The remaining time can only decrease, so the timeout is a strict upper bound regardless of interrupt frequency; truncation now only under-shoots the final poll by < 1 ms, which never extends the wait. The success, refused, and timeout paths are otherwise unchanged. Validated: NetConnectTimeoutTest (loopback success, refused-vs-timeout, black-hole timeout within budget) passes against the rebuilt library. A standalone harness driving the old vs new logic under a simulated 2.5 kHz EINTR storm confirms the old logic runs unbounded (aborted at >10x the budget) while the new logic returns at the budget. A deterministic regression test at the JNI layer is not practical: Java cannot target a POSIX signal at the specific connecting thread, and the only hanging- connect fixture available (TEST-NET-1 black-hole) is routing-dependent and already Assume-skipped on most runners. Rebuilds the committed linux-x86-64 libquestdb.so from the net.c change (mirrors the copy step in ci.yml); other platforms are refreshed by the rebuild-native-libs workflow.
The build_native.yaml step comment claimed "macOS uses the committed .dylib and is skipped inside the template." Both halves are now false: this branch dropped all committed dylibs and added a macOS build step to build_native.yaml (an active Darwin-conditioned step), so the dylib is compiled fresh on the mac-aarch64 agent like the .so/.dll. The stale comment could mislead a maintainer into skipping macOS in the template, which would hard-break the mac leg since no committed dylib remains. Reword to state that every platform's binary is built on its own native agent and none are committed.
Result handlers (onBatch/onEnd/onError) run inline on the worker's dispatch thread: QueryWorker.runLoop -> QueryImpl.runOn -> QwpQueryClient.execute consumes the I/O thread's events and invokes the handler on the worker thread, then signalDone (which sets done) runs only when that same thread loops back. So a handler that called the lease's blocking close() or await() parked the worker thread waiting for a done that only it could later produce -- a permanent, uninterruptible self-deadlock that also leaked the worker and hung any other thread awaiting the same handle. close()'s awaitUninterruptibly made it unrecoverable, and close() is now mandatory via try-with-resources, so the foot-gun was one stray call away. Add a worker-thread reentrancy guard: QueryWorker.isCurrentThreadWorker() compares Thread.currentThread() to the worker's dispatch thread, and QueryImpl.close()/await()/await(timeout) throw IllegalStateException up front when called on it. The exception unwinds at the user's call site with a message pointing to cancel() (the non-blocking in-handler stop); the worker is released normally by the app-thread close() afterwards, so no deadlock and no leak. Also fix the docs the bug exposed. Query.handler, Completion, and the QueryWorker class javadoc all claimed the handler runs on "the I/O thread"; it runs on the worker (dispatch) thread, which consumes the I/O thread's event queue inline. The handler/close()/await() javadocs now say so and forbid blocking calls from a handler. Tests: QueryWorkerTest.testCloseAndAwaitFromWorkerThreadThrowInsteadOf- Deadlocking drives close()/await()/await(timeout) on the worker thread and asserts IllegalStateException (a method-level timeout fails the test if the guard regresses into the old deadlock; verified it times out without the guard). The QuestDB facade E2E suite adds an end-to-end check that a real handler calling close()/await() throws and leaves the worker reusable.
testBrokenSenderIsNotReturnedToPool asserted assertNotSame(first, second) on the two PooledSender wrappers. SenderPool.borrow() allocates a fresh PooledSender on every call, so that comparison is unconditionally true and proves nothing: it stays green whether or not the broken slot was discarded. With the discard logic reverted the test failed only incidentally, when second.close() re-threw on the recycled broken delegate, masking the real intent. Compare the underlying SenderSlot instead via the existing slotOf() helper, mirroring testBorrowReturnRecyclesSameDecorator. The pool recycles slots, not wrappers, so a broken slot leaking back to the next borrower now surfaces as the same slot and fails the assertion directly. The finally swallows the incidental close() exception so the assertion result is what surfaces. Verified by injecting the bug (giveBack instead of discardBroken on flush failure): the corrected assertion fails as a clean Failure at the assertion line, and passes with correct code. Also document the QueryWorker lost-dispatch coverage boundary in QueryWorkerTest: the single-flight-reuse race fix has no deterministic unit reproduction here because it needs the worker mid-runOn(client) when the user thread re-dispatches, which requires a live query client. That regression is guarded end-to-end by QuestDBFacadeE2ETest.testSustainedMixedConcurrency in the parent repo; testShutdownRacingDispatchMustNotStrandCaller covers only the adjacent shutdown branch.
Query.close() used to drain an in-flight submit with an unbounded, uninterruptible wait: while (!done) doneCondition.awaitUninterruptibly(). Because the lease model makes close() mandatory (try-with-resources), ordinary code that bounded its own await(timeout) and gave up still hit this drain and could block the caller for the full remaining query duration when the server was slow to honor the cancel. Worse, when a QuestDB.close() raced an in-flight lease close() and the client I/O thread failed to join within shutdownJoinMs, QwpQueryClient.close() skipped closePool() -- the synthetic-terminal source -- so done was never set and the lease close() hung forever. Make the drain bounded and interruptible, and fail safe on timeout: - QueryImpl.close() now waits at most closeQueryTimeoutMillis via awaitNanos and aborts on interrupt (re-raising the flag). A caller is never pinned to the full query duration. - On timeout or interrupt the worker is discarded, not returned: its connection may still carry late RESULT_* frames for the abandoned query, which would corrupt the next borrower's stream. The pool grows a fresh worker on the next borrow. QueryClientPool.discard() mirrors the ingest side's discardBroken (closed/stale-generation guarded). - QueryWorker.shutdown() now interrupts the dispatch thread. takeEvent() (QwpSpscQueue.take) is interrupt-aware and executeOnce() turns the InterruptedException into a terminal -> signalDone, so a caller parked in close() is released even when the I/O thread is wedged and closePool() never runs. This closes the hang-forever race and makes QuestDB.close() more prompt. Add the query_close_timeout_ms pool knob (default 5000ms, symmetric with the ingest close_flush_timeout_millis) via QuestDBBuilder.queryCloseTimeoutMillis(long), wired through QuestDBImpl to the pool and guarded by PoolConfigHonoredTest's drift check. Update the Query.close() Javadoc to state the bounded/interruptible/discard contract. QueryCloseDrainTest covers the new behavior deterministically with a no-op connect hook: close() returns within the budget and discards a worker that does not drain, an interrupt aborts the drain promptly, and an already-drained worker is returned for reuse. The dispatch-thread interrupt's effect on a genuinely stuck execute() is verified by construction here; its end-to-end reproduction lives in the parent repo's concurrency tests.
🤖 Automated code review — level-3 multi-agent pass
Verdict: one blocking item (C1), plus test-hardening. Otherwise solid — Java-8-clean, no production resource leaks, no broken out-of-diff callsites, no data-path perf regressions; the M1 double-close and watchdog-TOCTOU fixes have deterministic regression tests (verified by reverting them). 🔴 BlockingC1 — The outer
The strand logic at Fix: change 🟠 Moderate
🟡 Minor / nits
ProcessThe PR bundles ~6 semi-independent concerns plus a CI/native overhaul. Consider splitting the §5 lease/concurrency core and the CI/native build changes for isolated review and bisectability. The title omits §5 (the riskiest part — M1 double-close, the TOCTOU cancel, C1); surface it in the squash message since branch history is throwaway. Generated by Claude Code (multi-agent review). Findings were verified against source; ~6 draft findings were dropped as false positives. Treat C1 as the one must-fix; the rest is polish / test-hardening. |
QueryWorker.runLoop checked `while (!shuttingDown)` at the top of the
loop. After runOn() returned, control hit that check without taking
signalLock or re-reading `current`, so QuestDB.close() shutting down a
borrowed, busy worker could exit the loop with a job already pending:
1. The worker is inside runOn(q1); the terminal callback fires inline,
signals done, and wakes the app thread.
2. The app thread's await() returns and it submits again on the same
reused lease. dispatch() reads shuttingDown==false, sets current=q2.
3. QuestDB.close() -> queryPool.close() -> worker.shutdown() flips
shuttingDown to true (close() snapshots every worker in `all`,
borrowed ones included).
4. The worker returns from runOn(q1), re-checks the loop top, sees
shuttingDown==true, and exits -- without re-inspecting current.
q2 is then never run, never stranded, and never signalled, so the app
thread's next await() blocks forever. dispatch()'s own shuttingDown check
does not catch it (dispatch won the race and read false), the shutdown
interrupt never reaches it (q2 was never handed to the client), and
client.close()'s synthetic terminal cannot reach it either.
The strand logic that signals a pending current already lives inside the
signalLock block (the `if (shuttingDown) return` branch). Looping
unconditionally routes every shutdown ordering through that single strand
point, and the loop still terminates via the same return. The existing
QueryWorkerTest.testShutdownRacingDispatchMustNotStrandCaller only drove
the parked-worker variant, so this busy-worker path was untested.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
[PR Coverage check]😍 pass : 278 / 435 (63.91%) file detail
|
tandem OSS PR: questdb/questdb#7341
Summary
Related ergonomics/resilience improvements for the QWP (WebSocket) client:
QuestDBfacade —errorHandler/connectionListener, previously unreachable from the pooled facade.lazy_connect=true— start the handle even when the server is down, buffer writes meanwhile, and read once it's up. Reads stay fully enabled.ws/wssstring (a singleaddrserver list) configures the whole cluster, driving both the ingest and query pools.borrowQuery()mirrorsborrowSender()), and a per-borrow generation closes a family of pool-corruption bugs (double-close, use-after-close, lost dispatch, stale cancel). This is internal hardening — see §5.Items 1–3 are independently usable and off by default.
1. Configurable TCP connect timeout
A connect to a black-holed/firewalled host blocks on the OS-level TCP connect timeout (60–120s): the socket is created blocking,
connect()runs, then it's switched to non-blocking. The code calls this out:Approach (native, cross-platform): non-blocking
connect()(EINPROGRESS) →poll/selectfor writability bounded by the caller's budget → confirm viagetsockopt(SO_ERROR). A sentinel (CONNECT_TIMEOUT = -3) lets Java raise a timeout-flagged exception. Generalises the existinghandleEintrInConnecthelper.Touches: native
share/net.c+windows/net.c+net.h;Net/NetworkFacade(Impl);HttpClientConfiguration.getConnectTimeout();HttpClient.connect()/WebSocketClient.doConnect();ConfigSchemaCOMMON keyconnect_timeout;Senderbuilder + both parsers;QwpWebSocketSender/QwpQueryClient(withConnectTimeout). Bounds the TCP connect and the TLS handshake (see below); the WebSocket upgrade stays under the request/auth timeout (auth_timeout_ms).1b. Bound the TLS handshake (and stop it busy-spinning)
JavaTlsClientSocket.startTlsSessionran the TLS handshake with rawrecv/sendon the now-non-blocking socket and never waited on readiness. A peer that completed the TCP connect but stalled before its half of the handshake left the engine inNEED_UNWRAPwithrecvreturning 0 (would-block), so the loop re-read in a tight cycle: a 100% CPU busy-spin with no deadline.connect_timeoutbounded only the TCP connect, and the upgrade's request/auth timeout never covered the handshake, so a stalledwss://host (e.g. QuestDB Cloud) could pin a core indefinitely and defeat the bounded connect. The busy-spin pre-dates this PR (it predates theconnect_timeoutwork), but the same change closes it.The handshake now runs through the client's existing deadline-aware
ioWait(the same primitiverecvOrDie/doSenduse): when the socket would block it parks on epoll/kqueue/select for the remaining connect budget and throws a timeout-flagged exception once it is spent.doConnectregisters the fd with the event loop before the handshake and bounds it byconnect_timeout, falling back to the request timeout whenconnect_timeoutis unset — so the handshake can no longer hang or spin even with the default config. BothHttpClient(https) andWebSocketClient(wss) share the fix, and the connect TLS block disconnects on any handshake error so the fd/native buffers do not leak.Tradeoff: with
connect_timeoutunset, awss://handshake that stalls now fails after the request timeout instead of spinning/hanging forever. That is strictly better, but it is a behavior change for that edge.Touches:
JavaTlsClientSocket(handshake extracted intorunHandshake), newSocketReadinessWaiter,Socket/PlainSocket,HttpClient.connect()/WebSocketClient.doConnect(). Tests:JavaTlsClientSocketTest.testHandshakeWaitsForReadabilityInsteadOfBusySpinning(a stalled peer must yield to the readiness waiter, not spin; a method-level timeout fails the test if the spin returns) andtestHandshakeCompletesWithoutWaitingWhenEngineMakesProgress(happy path).2. Ingest callbacks on the
QuestDBfacadeThe facade built ingest senders from config strings only (
SenderPool → Sender.fromConfig), so the programmaticSenderErrorHandler/SenderConnectionListenerwere unreachable — a facade user got the default loud-not-silent handlers with no way to observe async ingest errors or connection transitions.QuestDBImpl/SenderPooleach gain a full constructor carrying the callbacks; the white-box test-seam constructors are preserved as delegating shims.SenderPool.applyUserCallbacks()applies them to every pooled sender (non-SF and SF paths); internal recovery delegates are excluded. Defaultsnull.3. Tolerant startup:
lazy_connect=trueThe facade prewarms a reader (
QueryClientPool) synchronously and fail-fast (defaultquery_pool_min=1; queries have no async connect), so a down server failed the whole build. The fix is a single connect-string flag that makes the handle tolerate a down server without giving up reads — "starts when the server is down" and "never reads" are different things, and you almost always want the first.lazy_connect=true:query_pool_min=0, so neither side fail-fasts andbuild()returns promptly;query_pool_min=0, so nothing connects eagerly andborrowQuery()connects lazily on first use. While the server is still down aborrowQuery()throws (it has nowhere to connect — exactly like any read against a down server); once the server is up it connects and reads. The point is thatlazy_connectdefers the read connect rather than refusing reads.Because both sides must start non-blocking, a knob that forces a blocking / fail-fast startup is a configuration conflict, rejected up front with a clear remedy rather than silently overridden:
initial_connect_retryother thanasync(i.e.off/false/on/true/sync), andquery_pool_min > 0(connect string or builder call).lazy_connectis aSide.POOLregistry key — the two ws clients ignore it; the facade reads it, defaultsquery_pool_minto 0, and injectsinitial_connect_retry=asyncwhen the user set none.4. Single cluster config on the facade
A QuestDB cluster is one logical target reached over QWP for both ingest and query, so the facade takes one cluster config: a single
ws/wssstring that lists every node in oneaddrserver list and drives both the sender and query pools.build()validates the one string with both the ingest (validateWsConfigString) and egress (QwpQueryClient.validateConfig) validators; each side applies the keys it owns and ignores the rest. Pool keys are read from a singleConfigView, andQuestDBImplpasses the one config to both pool slots (preserving the white-box reflection seam).5. Pooled-handle lifecycle: lease/generation refactor + correctness fixes
This is the bulk of the diff and it is internal (no user-facing API beyond the borrow methods). It reshapes the facade so reads and writes share one pooled-lease model and fixes a family of pool-corruption bugs that the previous thread-affine / flag-guarded design allowed.
Refactor — symmetric pooled lease (
1245c17). Reads and writes now use one model:QuestDB.sender()/releaseSender()and the entire thread-pin subsystem behind them (pinToCurrentThread,releaseCurrentThread,clearPinIfCurrent, thethreadAffineThreadLocal, thePooledSenderinvalidated flag).borrowSender()is now the only way to lease aSender.query(),newQuery(), andexecuteSql(); addQuestDB.borrowQuery(), a closeable, non-allocatingQuerylease that mirrorsborrowSender(). Each pooledQueryWorkerowns one pre-allocatedQueryImpl, handed out reset on borrow;submit()dispatches on the held worker (single-flight) andclose()returns it to the pool. Reads now connect at borrow time, so underlazy_connectthe read pool defaults toquery_pool_min=0andbuild()does not fail-fast while the server is down. (This is why §3's read example usesborrowQuery()and not the removedexecuteSql.)Correctness fixes carried on top of the refactor:
f39e846(M1) — stale pooled handle can corrupt the pool under double-close / use-after-close. The pooled handle was the reused per-slot object guarded by a non-volatile in-use flag, so a stale handle'sclose()/cancel()/write could leak into a different borrow (double-release → two concurrent borrowers on one non-thread-safe client). Fix: every borrow gets its own immutable generation, stamped under the pool lock and re-checked on every op —close()/cancel()no-op on a stale generation (idempotency preserved),submit()/writes throw, and release re-checks the generation under the lock so a slot can never be enqueued twice. IntroducesQueryLease(wrappingQueryImpl) andSenderSlot(withPooledSenderas the per-borrow wrapper).e30a59c— lost query-worker dispatch under single-flight reuse.runLoop()clearedcurrentoutsidesignalLock, so a fastsubmit() → await() → submit()loop could clobber a freshly dispatched job and discard its signal, hanging the caller forever. Fix: clearcurrentundersignalLockat the moment of consumption. (Surfaced as a 60s hang intestSustainedMixedConcurrency, mostly on aarch64 CI; 15/15 clean with the fix.)bcb1e7a— watchdog cancel must re-check the lease generation under the pool lock.cancel(gen)validated the generation with an unlocked volatile read then issued the wire cancel separately — a TOCTOU where a watchdog could abort a new borrower's query with a spuriousSTATUS_CANCELLED. Fix: route throughQueryClientPool.cancelIfCurrent(worker, gen), which re-checks and cancels under the same lock the generation is bumped under;close()'s abort-on-unawaited-close uses the same path.f64567e— assert the lease wraps the same pooledQueryImpl. Pins the invariant that the freshly allocated per-borrow wrapper delegates back to the one pre-allocated, reusedQueryImpl(so a regression that allocated per borrow would be caught).Testing
NetConnectTimeoutTest— loopback success, refused-vs-timeout disambiguation, black-hole timeout within budget.QuestDBFacadeCallbacksTest— facade-wirederrorHandlerreceives the async budget-exhaustionSenderError;connectionListenerobserves connection events (no server needed).QuestDBLazyConnectTest—lazy_connect=truestarts + buffers a write with the server down, keeps the read pool enabled (borrowQuery()deferred to first use, not disabled), and rejects both conflicts (blockinginitial_connect_retry, explicitquery_pool_min > 0) from the connect string and from builder calls;QuestDBServerRecoveryTestdogfoodslazy_connect=truefor the full down → write → up → read lifecycle.QueryLeaseGenerationTest/SenderLeaseGenerationTest— the per-borrow generation guard: double-release, cross-borrow cancel/write, and the concurrent stale-cancel-after-reborrow TOCTOU (testConcurrentCancelDoesNotReachClientAfterReborrow).QuestDBServerRecoveryTest— full lifecycle: server down → facade starts → client writes (buffered) → server starts → write side reconnects and the reader connects on the firstborrowQuery().QuestDBBuilderTest— covers the single cluster-config surface; the shared-vocabulary and sender-pool-unwind tests run against one server with one config.TestWebSocketServernow serves both pools from one config like a real node:SERVER_INFOis emitted only on the egress/readpath (the ingest/writeACK stream would choke on it), plus asetRejectReadUpgrade()toggle to fail only the query upgrade.Full
impl+network+ facade suites pass locally on JDK 8 (source of truth) and the surface compiles on JDK 25 (java11+ front).CI / native
ci(native): therebuild_native_libs.ymllinux-x86-64 job moved frommanylinux2014(glibc 2.17) tomanylinux_2_28(glibc 2.28), mirroring linux-aarch64 — GitHub now forces actions onto Node 24 (glibc ≥ 2.27), which couldn't run in the 2.17 container (pre-existing breakage, unrelated to the C change).Compatibility
The
connect_timeoutknob and the ingest callbacks are additive and off by default. TheQuestDBfacade is new in this PR; the legacy directSenderandQwpQueryClientAPIs are unaffected.